DETAILED ACTION 

Response to Arguments 

1. Applicant's arguments filed 6/18/2010 have been fully considered but they are 
not persuasive. 

2. In re pages 12-13, the applicants present the sole argument that Nagasaka fails 
to teach the new limitation of "a plurality of static image data and each static image data 
represents scenes in the video data having variable time width for the portion of video 
data represented in the scene". It is argued that Fig. 7 of Nagasaka merely collects a 
variable number of shot-representative pictures, and not a collection of pictures 
representing portions of video having variable time widths. 

3. In response, the examiner respectfully disagrees. Col. 9, line 8 through Col. 10, 
line 67 discusses the digest making process wherein the shot-representative pictures 
are extracted when a new scene/shot is determined. Figure 7 displays a result of such 
a process, wherein over several 15 minute intervals between 11:00 and 12:00, different 
numbers of shot-representative pictures have been extracted, e.g., between 11:00 and 
11:15, 4 shot-representative pictures have been extracted. Similarly, within the next 
three 15 minute intervals, 2, 7 and 3 shot-representative pictures, respectively, 
have been extracted. The created digest thus includes a different number of 
shot-representative pictures over each 15 minute interval. Col. 8, lines 19-36 teaches wherein 
the shot-representative pictures are used to display a digest, from which, upon selection 
of a shot-representative picture by a user, reproduction from the position where the 
shot-representative picture was created is commenced. The shot-representative 
pictures displayed represent the scenes in the video data with varying time widths since 
for similar 15 minute intervals there exist varying numbers of shot-representative 
pictures. Additionally, as admitted by the applicants, the shot-representative pictures 
are not created on a fixed interval. Therefore, the shot-representative pictures, when 
presented as a digest for reproduction, represent the scenes in the video data with 
variable time widths. For further discussion, e.g., a first 15 minute interval has two 
shot-representative pictures and a second 15 minute interval has seven shot-representative 
pictures. Therefore, in the first 15 minute interval, the two shot-representative pictures 
(first group) can represent two scenes (whose width varies with the shot detection). 
Similarly, in the second 15 minute interval, the seven shot-representative pictures 
(second group) can represent seven scenes (whose width varies with the shot 
detection). Therefore, it is at least clear that the first group of shot-representative 
pictures represents scenes in the video data having varying widths that are at least larger 
than those represented by the second group of shot-representative pictures. 
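
For illustration only, the arithmetic behind this comparison may be sketched as follows. The sketch assumes, hypothetically, that each shot-representative picture stands for one scene and that the scenes in an interval together span the full 15 minutes. Neither assumption is taken from Nagasaka, and the interval labels and the names used below are illustrative. Under those assumptions, the average scene width in an interval is simply the interval length divided by the number of pictures.

```python
from fractions import Fraction

# Interval length discussed for Figure 7 (minutes).
INTERVAL_MINUTES = 15

# Picture counts per 15 minute interval as recited above (4, then 2, 7 and 3).
# The interval labels are assumed for illustration.
pictures_per_interval = {
    "11:00-11:15": 4,
    "11:15-11:30": 2,
    "11:30-11:45": 7,
    "11:45-12:00": 3,
}


def mean_scene_width(picture_count: int, interval_minutes: int = INTERVAL_MINUTES) -> Fraction:
    """Average scene width in minutes, assuming the scenes tile the interval."""
    if picture_count <= 0:
        raise ValueError("picture_count must be positive")
    return Fraction(interval_minutes, picture_count)


mean_widths = {
    interval: mean_scene_width(count)
    for interval, count in pictures_per_interval.items()
}

for interval, width in mean_widths.items():
    print(f"{interval}: {float(width):.2f} minutes per scene on average")
```

On these assumptions, an interval with two pictures averages 7.5 minutes per scene, whereas an interval with seven pictures averages roughly 2.14 minutes per scene, which is the sense in which the first group represents wider scenes than the second.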

Claim Rejections - 35 USC § 101 
4. 35 U.S.C. 101 reads as follows: 

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of 
matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the 
conditions and requirements of this title. 

The USPTO "Interim Guidelines for Examination of Patent Applications for Patent 
Subject Matter Eligibility" (Official Gazette notice of 22 November 2005), Annex IV, 
reads as follows: 



Claims that recite nothing but the physical characteristics of a form of energy, such as a frequency, 
voltage, or the strength of a magnetic field, define energy or magnetism, per se, and as such are 
nonstatutory natural phenomena. O'Reilly, 56 U.S. (15 How.) at 112-14. Moreover, it does not appear 
that a claim reciting a signal encoded with functional descriptive material falls within any of the 
categories of patentable subject matter set forth in Sec. 101. 

... a signal does not fall within one of four statutory classes of 101. 

... signal claims are ineligible for patent protection because they do not fall within any of the four 
statutory classes of Sec. 101. 

Claims 5-8 and 16-19 are rejected under 35 U.S.C. 101 as not falling within one 
of the four statutory categories of invention. Supreme Court precedent and recent 
Federal Circuit decisions indicate that a statutory "process" under 35 U.S.C. 101 must 
(1) be tied to another statutory category (such as a particular apparatus), or (2) 
transform underlying subject matter (such as an article or material) to a different state or 
thing. While the instant claims recite a series of steps or acts to be performed, the 
claims neither transform underlying subject matter nor positively tie to another statutory 
category that accomplishes the claimed method steps, and therefore do not qualify as a 
statutory process. For example, an image reduction process comprising (1) extracting, 
obtaining, and reproducing steps, (2) extracting, obtaining, requesting and reproducing 
steps, or (3) extracting, associating, accepting and reproducing steps is of sufficient 
breadth that it would be reasonably interpreted as a series of steps completely 
performed mentally, verbally or without a machine. 

Furthermore, claims 9-12, 20, 28-31 are rejected under 35 U.S.C. 101 because 
the claimed invention is directed to non-statutory subject matter as follows. Regarding 
claims 9-12, 20, 28-31, in the state of the art, transitory signals are commonplace as a 
medium for transmitting computer instructions and thus, in the absence of any evidence 
to the contrary and given the broadest reasonable interpretation, the scope of a "data 
storage medium" covers a signal per se. Therefore, it is suggested by the examiner that 
the applicants amend the claims to additionally recite that the data storage medium is of 
a "non-transitory" type. 

Application/Control Number: 10/661,489 
Art Unit: 2621 



Claim Rejections - 35 USC § 103 

5. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

6. Claims 1-31 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Saunders et al. (US 2006/0288113) in view of Nagasaka (US 5,974,218). 

Regarding claim 1, Saunders et al. teaches an image reproduction system that 
reproduces static image data synchronously with reproduction of video data, 
comprising: 

a position information obtainment unit that obtains a reproduction time position of 
the video data as the video data is reproduced (Paragraph 42 teaches wherein a 
content author can determine the rendering time for a video component of the entire 
presentation. Fig. 5 further shows where Video 502 is synchronized along with other 
media samples/data. Paragraph 44 teaches that rendering times for each of the video 
sequences 714 are stored by the format writer 716 as part of the presentation. 
Paragraphs 54-57 teach a renderer that uses a browser 758 or a multimedia player 
760 that receives the presentation and reproduces it according to the rendering times set 
in the presentation); 

an image obtainment unit that obtains extracted static image data associated in 
advance with the obtained reproduction time position (Paragraph 42 teaches wherein a 
content author can determine the rendering time for each media sample ("HTML, 
image") of the entire presentation. Fig. 5 further shows where media samples within 
Banner 504, Slides 506 are synchronized along with other media samples/data. 
Paragraph 44 teaches that rendering times for media samples are stored by the format 
writer 716 as part of the presentation. Paragraphs 54-57 teach a renderer that uses a 
browser 758 or a multimedia player 760 that receives the presentation and reproduces 
according to the rendering times set in the presentation); and 

an image reproduction unit that reproduces the obtained static image data 
synchronously with the video data (Fig. 5 and Paragraphs 51-53 teach where a client 
accesses a presentation which is reproduced according to the rendering times set by the 
user as discussed above). 

However, Saunders fails to particularly teach a preprocessing unit that extracts 
static image from the video data by an operator operation that performs the setting 
operation while viewing the data before a disposition registration of the video data is 
initiated, wherein the preprocessing unit extracts a plurality of still image data and each 
static image data represents scenes in the video data having variable time width for the 
portion of video data represented in the scene. 

In an analogous art, Nagasaka et al. teaches in col. 3, lines 43-48, col. 3, line 56 
through col. 4, line 35, col. 9, line 9 through col. 10, line 67 wherein a broadcast 
program is programmed to be recorded by a user. During the recording of the broadcast 
program a digest picture is extracted from the video program. The shot-representative 
pictures are extracted based on a detection of a new scene/shot. The 
scene/shot length can vary from one scene to the next according to the broadcast 
program and hence the extraction of the static image (generation of shot-representative 
pictures) is based on a variable time-width. Furthermore, col. 8, lines 19-36 teaches 
wherein the shot-representative pictures are used to display a digest, from which, upon 
selection of a shot-representative picture by a user, reproduction from the position 
where the shot-representative picture was created is commenced. The 
shot-representative pictures displayed represent the scenes in the video data with varying 
time widths since for similar 15 minute intervals there exist varying numbers of 
shot-representative pictures. Additionally, the shot-representative pictures are not created on 
a fixed interval. Therefore, the shot-representative pictures, when presented as a digest 
for reproduction, represent the scenes in the video data with variable time widths. For 
further discussion, e.g., a first 15 minute interval has two shot-representative pictures 
and a second 15 minute interval has seven shot-representative pictures. Therefore, in 
the first 15 minute interval, the two shot-representative pictures (first group) can 
represent two scenes (whose width varies with the shot detection). Similarly, in the 
second 15 minute interval, the seven shot-representative pictures (second group) can 
represent seven scenes (whose width varies with the shot detection). Therefore, it is at 
least clear that the first group of shot-representative pictures represents scenes in the 
video data having varying widths that are at least larger than those represented by the 
second group of shot-representative pictures. 

The system of Saunders et al. can be modified to allow for capturing and storing 
of the still image files, as taught by Nagasaka, so that the extracted still images can be 
used for the media presentation file of Saunders et al. Thereafter, the still images of 
Nagasaka can be added to the existing presentation to finalize a presentation as 
desired. The still images are extracted even before the corresponding broadcast 
program has completed recording; therefore, the system of Saunders et al. cannot yet 
utilize the extracted digest pictures or the associated video. Therefore, the extraction of 
the digest picture occurs before a disposition registration can even begin. 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to incorporate the ability to extract a still image from a video data 
stream as taught by Nagasaka into the system taught by Saunders et al., because such 
incorporation would allow a user in Saunders et al. to seize or grasp the content and 
composition of a video speedily (Nagasaka in col. 3, lines 27-31). 

Regarding claim 2, Saunders et al. teaches an image reproduction system that 
reproduces static image data synchronously with reproduction of video data, 
comprising: 

a delivery server that holds the video data and static image data associated with 
the video data (Paragraph 0042 teaches of at least "video component stream 502" and 
"slides component stream 508" (still images) are "delivered from the media server to the 
client in a synchronous manner to form a complete presentation". Additionally, 
paragraph 0042 teaches "a content author obtains or creates web content data and/or 
media data such as ... (e.g. HTML, image), audio and/or video represented by arrow 
704, which is used to create a presentation file/stream. Paragraph 49 teaches that "the 
media server 742 may receive the presentation stream 744 or the presentation file 746". 
Since the presentation file 746 is created using the at least "video component stream 
502" and "slides component stream 508", the media server holds the video and static 
image data associated with the video data); and 

a browsing client that reproduces and displays on a screen the video data and 
static image data provided by the delivery server (paragraphs 51-53, client browses a 
presentation "after receiving the playback request from the client, the media server 742 
delivers the presentation stream 744 or presentation file 746 to the client" (first line of 
paragraph 0051). Rendering in Saunders means rendering for display (paragraph 
0031: "Rendering 1 sample may trigger the rendering or display of these media samples 
altogether") on browser 758 or the multimedia player 760. The display on the client 
device is taught in paragraph 0072, wherein "In operation, a client uses a computer 
such as a .... In addition, the format reader 764 executes instructions to retrieve an 
audio component stream 102, a video component stream 104, and a script component 
stream 108 from the presentation stream 754 or presentation file 756 and to deliver 
these component streams to a browser 758 or multimedia player 760."), 

wherein the browsing client comprises: 

a position information obtainment unit that obtains a reproduction time position of 
the video data as the video data is reproduced (paragraphs 54-57 teach a renderer 
that uses a browser 758 or a multimedia player 760 that receives the presentation and 
reproduces it according to the rendering times set in the presentation (as discussed in 
claim 1 above)); 

an image request unit that makes a request to the delivery server for the static 
image data associated in advance with the reproduction time position (paragraphs 
51-53 teach where a user on a client machine requests a particular presentation to be 
accessed/viewed. The presentation includes the still images within media samples); and 

an image reproduction unit that reproduces the static image data synchronously 
with the video data, the static image data being provided by the delivery server in 
response to the request (paragraphs 56-57 teach where a presentation, which 
includes video 502 and images stored by themselves or within Banners 504 and/or Slides 506, 
is reproduced in synchronism). 

However, Saunders fails to particularly teach a preprocessing unit that extracts 
static image from the video data by an operator operation that performs the setting 
operation while viewing the data before a disposition registration of the video data is 
initiated, wherein the preprocessing unit extracts a plurality of still image data and each 
static image data represents scenes in the video data having variable time width for the 
portion of video data represented in the scene. 

In an analogous art, Nagasaka et al. teaches in col. 3, lines 43-48, col. 3, line 56 
through col. 4, line 35, col. 9, line 9 through col. 10, line 67 wherein a broadcast 
program is programmed to be recorded by a user. During the recording of the broadcast 
program a digest picture is extracted from the video program. The shot-representative 
pictures are extracted based on a detection of a new scene/shot. The 
scene/shot length can vary from one scene to the next according to the broadcast 
program and hence the extraction of the static image (generation of shot-representative 
pictures) is based on a variable time-width. Furthermore, col. 8, lines 19-36 teaches 
wherein the shot-representative pictures are used to display a digest, from which, upon 
selection of a shot-representative picture by a user, reproduction from the position 
where the shot-representative picture was created is commenced. The 
shot-representative pictures displayed represent the scenes in the video data with varying 
time widths since for similar 15 minute intervals there exist varying numbers of 
shot-representative pictures. Additionally, the shot-representative pictures are not created on 
a fixed interval. Therefore, the shot-representative pictures, when presented as a digest 
for reproduction, represent the scenes in the video data with variable time widths. For 
further discussion, e.g., a first 15 minute interval has two shot-representative pictures 
and a second 15 minute interval has seven shot-representative pictures. Therefore, in 
the first 15 minute interval, the two shot-representative pictures (first group) can 
represent two scenes (whose width varies with the shot detection). Similarly, in the 
second 15 minute interval, the seven shot-representative pictures (second group) can 
represent seven scenes (whose width varies with the shot detection). Therefore, it is at 
least clear that the first group of shot-representative pictures represents scenes in the 
video data having varying widths that are at least larger than those represented by the 
second group of shot-representative pictures. 

The system of Saunders et al. can be modified to allow for capturing and storing 
of the still image files, as taught by Nagasaka, so that the extracted still images can be 
used for the media presentation file of Saunders et al. Thereafter, the still images of 
Nagasaka can be added to the existing presentation to finalize a presentation as 
desired. The still images are extracted even before the corresponding broadcast 
program has completed recording; therefore, the system of Saunders et al. cannot yet 
utilize the extracted digest pictures or the associated video. Therefore, the extraction of 
the digest picture occurs before a disposition registration can even begin. 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to incorporate the ability to extract a still image from a video data 
stream as taught by Nagasaka into the system taught by Saunders et al., because such 
incorporation would allow a user in Saunders et al. to seize or grasp the content and 
composition of a video speedily (Nagasaka in col. 3, lines 27-31). 

Regarding claim 3, the proposed combination of Saunders et al. and Nagasaka 
teaches the claimed subject matter as discussed above in claim 1, and furthermore, Saunders et al. 
teaches the claimed subject matter further comprising: 

a specification unit that accepts reproduction time position information of the 
video data from a user's input (as discussed in claim 1 above, wherein a content author 
can set rendering times for video sequences 502, 714, and the plethora of media 
samples including still images); and 

a video reproduction unit that reproduces the video data from a time position 
corresponding to the accepted reproduction time position information (as discussed in 
claim 1 above), 

wherein the position information obtainment unit obtains time position information 
specified by the user's input (as discussed in claim 1 above, wherein a content author 
can set rendering times for video sequences 502, 714, and the plethora of media 
samples including still images). 

Regarding claim 4, Saunders et al. teaches an image reproduction system that 
reproduces video data and plural pieces of static image data in association with each 
other, comprising: 

a specification unit that accepts a command provided by a user's input to select 
one piece of static image data from the plural pieces of static image data (paragraphs 
36-37 teach wherein a client requests to seek to a particular point in the presentation. 
The client has the ability to seek to a particular point, including the locations of still 
images stored in banner 504 and/or slides 506); and 

a video reproduction unit that reproduces the video data from a reproduction time 
position with which the selected piece of static image data is associated (as discussed 
above, after the client has chosen a particular still image, i.e. a particular location, the 
presentation resumes from that particular location). 

However, Saunders fails to particularly teach a preprocessing unit that extracts 
static image from the video data by an operator operation that performs the setting 
operation while viewing the data before a disposition registration of the video data is 
initiated, wherein the preprocessing unit extracts a plurality of still image data and each 
static image data represents scenes in the video data having variable time width for the 
portion of video data represented in the scene. 

In an analogous art, Nagasaka et al. teaches in col. 3, lines 43-48, col. 3, line 56 
through col. 4, line 35, col. 9, line 9 through col. 10, line 67 wherein a broadcast 
program is programmed to be recorded by a user. During the recording of the broadcast 
program a digest picture is extracted from the video program. The shot-representative 
pictures are extracted based on a detection of a new scene/shot. The 
scene/shot length can vary from one scene to the next according to the broadcast 
program and hence the extraction of the static image (generation of shot-representative 
pictures) is based on a variable time-width. Furthermore, col. 8, lines 19-36 teaches 
wherein the shot-representative pictures are used to display a digest, from which, upon 
selection of a shot-representative picture by a user, reproduction from the position 
where the shot-representative picture was created is commenced. The 
shot-representative pictures displayed represent the scenes in the video data with varying 
time widths since for similar 15 minute intervals there exist varying numbers of 
shot-representative pictures. Additionally, the shot-representative pictures are not created on 
a fixed interval. Therefore, the shot-representative pictures, when presented as a digest 
for reproduction, represent the scenes in the video data with variable time widths. For 
further discussion, e.g., a first 15 minute interval has two shot-representative pictures 
and a second 15 minute interval has seven shot-representative pictures. Therefore, in 
the first 15 minute interval, the two shot-representative pictures (first group) can 
represent two scenes (whose width varies with the shot detection). Similarly, in the 
second 15 minute interval, the seven shot-representative pictures (second group) can 
represent seven scenes (whose width varies with the shot detection). Therefore, it is at 
least clear that the first group of shot-representative pictures represents scenes in the 
video data having varying widths that are at least larger than those represented by the 
second group of shot-representative pictures. 

The system of Saunders et al. can be modified to allow for capturing and storing 
of the still image files, as taught by Nagasaka, so that the extracted still images can be 
used for the media presentation file of Saunders et al. Thereafter, the still images of 
Nagasaka can be added to the existing presentation to finalize a presentation as 
desired. The still images are extracted even before the corresponding broadcast 
program has completed recording; therefore, the system of Saunders et al. cannot yet 
utilize the extracted digest pictures or the associated video. Therefore, the extraction of 
the digest picture occurs before a disposition registration can even begin. 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to incorporate the ability to extract a still image from a video data 
stream as taught by Nagasaka into the system taught by Saunders et al., because such 
incorporation would allow a user in Saunders et al. to seize or grasp the content and 
composition of a video speedily (Nagasaka in col. 3, lines 27-31). 

Method claims 5-7 and medium claims 9-11 are rejected for the same reasons 
as discussed in claim 2 above. The limitations in claims 5-7 and 9-11 are broader than 
the limitations in claim 2. 

Method claim 8 and medium claim 12 are rejected for the same reasons as 
discussed in claim 4 above. The limitations in claims 8 and 12 are broader than the 
limitations in claim 4 above. 

Regarding claims 13-20, the proposed combination of Saunders et al. and 
Nagasaka teaches the claimed subject matter as discussed in claims 1-2 and 4-12 above, and 
furthermore, Saunders et al. teaches the claimed subject matter further comprising a retrieval interface 
(Fig. 7B and paragraph 49 teach a Client Site that retrieves a particular 
Presentation File) including a keyword input part that matches keyword input with 
contents data associated with the image data to retrieve results (Fig. 7B and paragraph 
49 teach a Client Site that retrieves a particular Presentation File. In order to 
request a particular Presentation File, information to differentiate one Presentation File 
from another is inherently input by the user. See paragraph 3 above for further 
discussion). 

Regarding claims 21-31, the limitations are met since the broadcast program 
recorded in Nagasaka consists of a plurality of frames. See col. 3, lines 43-48, col. 3, line 56 
through col. 4, line 35, col. 9, line 9 through col. 10, line 67, wherein col. 4, line 4 
teaches "a detecting unit for fetching the frame based picture signal and for detecting as 
representative pictures those frames which correspond to the inter-shot transitions of 
the television program, respectively, a storage unit for storing the digest picture 
constituted by a set of the representative pictures". The frame that is detected and 
later used for creating the digest picture is a single frame (therefore static) and thus meets the 
claimed static image data. Furthermore, the static images of materials used in a presentation 
shown in the video data are met by the "frame which correspond to the inter-shot 
transition of the television program" above. 

It would have been obvious to one of ordinary skill in the art at the time of the invention 
to combine the limitations of claims 21-31 as taught by Nagasaka with the teachings of 
Saunders et al. for the same motivation as discussed in claims 1-12 above. 

Conclusion 

7. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to GELEK TOPGYAL whose telephone number is 
(571) 272-8891. The examiner can normally be reached on 8:30am-5:00pm. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Peter-Anthony Pappas, can be reached on 571-272-7646. The fax phone 
number for the organization where this application or proceeding is assigned is 571- 
273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 



/Gelek Topgyal/ 
Examiner, Art Unit 2621 

/Peter-Anthony Pappas/ 

Supervisory Patent Examiner, Art Unit 2621 



